Learning to Detect Objects with a 1 Megapixel Event Camera

Neural Information Processing Systems

Thanks to these characteristics, event cameras are particularly suited for scenarios with high motion, challenging lighting conditions, and low-latency requirements. However, due to the novelty of the field, the performance of event-based systems on many vision tasks still lags behind that of conventional frame-based solutions. The main reasons for this performance gap are: the lower spatial resolution of event sensors compared to frame cameras; the lack of large-scale training datasets; and the absence of well-established deep learning architectures for event-based processing. In this paper, we address all of these problems in the context of an event-based object detection task. First, we publicly release the first high-resolution large-scale dataset for object detection.
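The abstract points to the absence of established architectures for event-based processing; in practice, a common first step is to densify the asynchronous event stream into a tensor that a conventional detector can consume. Below is a minimal sketch of one generic encoding, a polarity-summed voxel grid over (x, y, t, polarity) events. This is illustrative only and is not the representation used in the paper.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate (x, y, t, polarity) events into a (num_bins, H, W) grid.

    Events are binned along time; polarity (+1/-1) is summed per cell.
    Generic encoding for illustration, not the paper's representation.
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return grid
    ts = np.array([e[2] for e in events], dtype=np.float64)
    t0, t1 = ts.min(), ts.max()
    span = max(t1 - t0, 1e-9)  # avoid division by zero for a single timestamp
    for (x, y, t, p) in events:
        b = min(int((t - t0) / span * num_bins), num_bins - 1)
        grid[b, int(y), int(x)] += 1.0 if p > 0 else -1.0
    return grid

# Example: three events on a hypothetical 4x4 sensor, two time bins
evts = [(0, 0, 0.0, 1), (1, 1, 0.5, -1), (3, 3, 1.0, 1)]
g = events_to_voxel_grid(evts, num_bins=2, height=4, width=4)
```

The resulting dense tensor can then be fed to a standard convolutional detection backbone.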


MetaAnchor: Learning to Detect Objects with Customized Anchors

Yang, Tong, Zhang, Xiangyu, Li, Zeming, Zhang, Wenqiang, Sun, Jian

Neural Information Processing Systems

We propose a novel and flexible anchor mechanism named MetaAnchor for object detection frameworks. Unlike many previous detectors, which model anchors in a predefined manner, in MetaAnchor anchor functions can be dynamically generated from arbitrary customized prior boxes. Taking advantage of weight prediction, MetaAnchor is able to work with most anchor-based object detection systems, such as RetinaNet. Compared with the predefined anchor scheme, we empirically find that MetaAnchor is more robust to anchor settings and bounding-box distributions; in addition, it also shows potential on transfer tasks. Our experiments on the COCO detection task show that MetaAnchor consistently outperforms its counterparts in various scenarios.


Review for NeurIPS paper: Learning to Detect Objects with a 1 Megapixel Event Camera

Neural Information Processing Systems

Weaknesses: - I believe that the details given in this work are sufficient to reproduce the experimental results in the paper. The neural network layers/functions used are well established and the training schedule looks clean. Nevertheless, I would like to encourage the authors to publish their code. However, all components are well chosen from previously published work: examples include the representation of events based on [48, 22, 49, 23], keeping temporal state from [39], and the detector head from [37].


Reviews: MetaAnchor: Learning to Detect Objects with Customized Anchors

Neural Information Processing Systems

Summary: This paper proposes MetaAnchor, an anchor mechanism for object detection. In MetaAnchor, anchor functions are dynamically generated from an anchor box b_i, which describes the common properties of the object boxes associated with the i-th bin. It introduces an anchor function generator that maps any bounding-box prior b_i to the corresponding anchor function. In this paper, the anchor function generator is modeled as a two-layer network producing a residual term R, which is added to the shared, learnable parameters theta* of the anchor function. The residual term R can also depend on the input feature x, which yields a data-dependent variant of the anchor function generator. Using a weight prediction mechanism, the anchor function generator can be implemented and embedded into existing object detection frameworks for joint optimization.
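The summary describes the generator concretely: a two-layer network maps a prior box b_i to a residual R that is added to shared parameters theta*. The numpy sketch below shows that structure with hypothetical dimensions and untrained random weights; the actual MetaAnchor generator produces detection-head weights and is trained jointly with the detector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared, learnable base parameters theta* for the anchor function
# (here: a weight vector of illustrative size D for a hypothetical head).
D, HIDDEN = 8, 16
theta_star = rng.standard_normal(D)

# Two-layer generator: maps a prior box b_i = (w, h) to a residual R,
# so the anchor-specific parameters are theta_i = theta* + R(b_i).
W1 = rng.standard_normal((HIDDEN, 2)) * 0.1
W2 = rng.standard_normal((D, HIDDEN)) * 0.1

def anchor_function_params(box_prior):
    b = np.asarray(box_prior, dtype=np.float64)
    hidden = np.maximum(W1 @ b, 0.0)  # ReLU
    residual = W2 @ hidden            # R(b_i)
    return theta_star + residual

# Any customized prior box yields its own parameters at run time,
# instead of a fixed bank of predefined anchors.
params_small = anchor_function_params((16.0, 16.0))
params_large = anchor_function_params((128.0, 64.0))
```

Because the generator is a small network, new prior boxes can be supplied at inference time without retraining the whole detector, which is what makes the mechanism "customizable".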


The Use of Multimodal Large Language Models to Detect Objects from Thermal Images: Transportation Applications

Ashqar, Huthaifa I., Alhadidi, Taqwa I., Elhenawy, Mohammed, Khanfar, Nour O.

arXiv.org Artificial Intelligence

The integration of thermal imaging data with Multimodal Large Language Models (MLLMs) constitutes an exciting opportunity for improving the safety and functionality of autonomous driving systems and many Intelligent Transportation Systems (ITS) applications. This study investigates whether MLLMs can understand complex images from RGB and thermal cameras and detect objects directly. Our goals were to 1) assess the ability of the MLLM to learn from information from various sets, 2) detect objects and identify elements in thermal cameras, 3) determine whether two independent modality images show the same scene, and 4) learn all objects using different modalities. The findings showed that both GPT-4 and Gemini were effective in detecting and classifying objects in thermal images. For pedestrian classification, the Mean Absolute Percentage Error (MAPE) was 70.39% and 81.48%, respectively. For bike, car, and motorcycle detection, GPT-4 produced MAPEs of 78.4%, 55.81%, and 96.15%, while Gemini produced MAPEs of 66.53%, 59.35%, and 78.18%, respectively. These findings further demonstrate that MLLMs can interpret thermal images and can be employed in advanced imaging automation technologies for ITS applications.
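The MAPE figures quoted above follow the standard definition of the metric. For reference, here is a minimal sketch computing it on hypothetical object counts (not the study's data):

```python
def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.

    Averages |actual - predicted| / |actual| over all samples.
    """
    pairs = list(zip(y_true, y_pred))
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

# Hypothetical per-image counts: ground truth vs. model-reported objects
actual    = [10, 20, 40]
predicted = [ 7, 25, 30]
error = mape(actual, predicted)  # (30% + 25% + 25%) / 3 ≈ 26.67%
```

Note that lower MAPE is better, so the large percentages reported in the abstract indicate substantial counting error despite the models' qualitative success at recognizing objects.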


Meta shares AI model that can detect objects it hasn't seen before

Engadget

AI normally needs to be trained on existing material to detect objects, but Meta has a way for the technology to spot items without help. The social media giant has published a "Segment Anything" AI model that can detect objects in pictures and videos even if they weren't part of the training set. You can select items by clicking them or using free-form text prompts. As Reuters explains, you can type the word "cat" and watch the AI highlight all the felines in a given photo. The model can also work in tandem with other models.


AI System Can Detect Objects Around Corners – NVIDIA Developer News Center

#artificialintelligence

To help autonomous vehicles and robots potentially spot objects that lie just outside a system's direct line of sight, researchers from Stanford, Princeton, Rice, and Southern Methodist University developed a deep learning-based system that can detect objects, including words and symbols, around corners. "Compared to other approaches, our non-line-of-sight imaging system provides uniquely high resolutions and imaging speeds," said Stanford University's Chris Metzler in the Rice University post, Cameras see around corners in real time with deep learning. "These attributes enable applications that wouldn't otherwise be possible," he added. To achieve this, the system relies on a laser that can capture detailed images of objects around corners in real time. Specifically, light from a high-speed laser is beamed onto a wall, light from the hidden area bounces back to the wall, and that light is reflected to a camera.



How to Detect Objects with Deep Learning on Raspberry Pi

#artificialintelligence

The real world poses challenges like limited data and tiny hardware, such as mobile phones and Raspberry Pis, that can't run complex deep learning models. This post demonstrates how you can do object detection on a Raspberry Pi.
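Whatever lightweight model runs on the Pi, its raw outputs typically pass through the same post-processing: drop low-confidence boxes, then greedily suppress overlapping duplicates (non-max suppression). A minimal dependency-free sketch, with illustrative thresholds:

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def filter_detections(dets, score_thresh=0.5, iou_thresh=0.45):
    """Drop low-confidence boxes, then apply greedy non-max suppression.

    dets: list of (box, score) with box = (x1, y1, x2, y2).
    """
    kept = []
    for box, score in sorted(dets, key=lambda d: -d[1]):
        if score < score_thresh:
            break  # sorted descending, so nothing below passes
        if all(iou(box, k) < iou_thresh for k, _ in kept):
            kept.append((box, score))
    return kept

# Example: two overlapping boxes, one distant box, one low-confidence box
dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8),
        ((50, 50, 60, 60), 0.7), ((0, 0, 5, 5), 0.3)]
result = filter_detections(dets)
```

Running this pure-Python step on the Pi itself is cheap; the expensive part is the model forward pass, which is why the post focuses on small architectures.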